Evolving Protein Motifs Using a Stochastic Regular Language with Codon-Level Probabilities

نویسندگان

  • Brian J Ross
  • Brian J. Ross
چکیده

Experiments involving the evolution of protein motifs using genetic programming are presented. The motifs use a stochastic regular expression language that uses codon-level probabilities within conserved sets (masks). Experiments compared basic genetic programming with Lamarckian evolution, as well as the use of “natural” probability distributions for masks obtained from the sequence database. It was found that Lamarckian evolution was detrimental to the probability performance of motifs. A comparison of evolved and natural mask probability schemes is inconclusive, since these strategies produce incompatible characterisations of motif fitness as used by the genetic programming system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Evaluation of a Stochastic Regular Motif Language for Protein Sequences

A probabilistic regular motif language for protein sequences is evaluated. SRE-DNA is a stochastic regular expression language that combines characteristics of regular expressions and stochastic representations such as Hidden Markov Models. To evaluate its expressive merits, genetic programming is used to evolve SRE-DNA motifs for aligned sets of protein sequences. Different constrained grammat...

متن کامل

Probabilistic Pattern Matching and the Evolutionof Stochastic Regular

The use of genetic programming for probabilistic pattern matching is investigated. A stochastic regular expression language is used. The language features a statistically sound semantics, as well as a syntax that promotes eecient manipulation by genetic programming operators. An algorithm for eecient string recognition based on approaches in conventional regular language recognition is used. Wh...

متن کامل

Finding Genes by Hidden Markov Models with a Protein Motif Dictionary

A new method for combining protein motif dictionary to gene nding system is proposed. The system consists of Hidden Markov Models (HMMs) and a dictionary. The HMMs represents the nucleotide acid bases, the codons, and the amino acids. The 'words' in the dictionary is described by the sequence of these HMMs and represent the noncoding regions, the codons, protein motifs, tRNA regions and signals...

متن کامل

Inferring stochastic regular grammars with recurrent neural networks

Recent work has shown that the extraction of symbolic rules improves the generalization performance of recurrent neural networks trained with complete (positive and negative) samples of regular languages. This paper explores the possibility of inferring the rules of the language when the network is trained instead with stochastic, positive-only data. For this purpose, a recurrent network with t...

متن کامل

RSMD-repeat searcher and motif detector

The functionality of a gene or a protein depends on codon repeats occurring in it. As a consequence of their vitality in protein function and apparent involvement in causing diseases, an interest in these repeats has developed in recent years. The analysis of genomic and proteomic sequences to identify such repeats requires some algorithmic support from informatics level. Here, we proposed an o...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002